Skip numeric drop-out when PComputeWindow is a null_tile_window in Bl…#7256

Merged
qianfengz merged 2 commits into develop from users/qianfengz/ck/block_dropout_no_drop on May 13, 2026
Conversation

@qianfengz
Contributor

The BlockDropout implementation already provides complete logic for generating random numbers and applying dropout to the P tensor after the first attention GEMM, with support for both 32x32 and 16x16 Warp-GEMM shapes and for both wave32 and wave64 architectures.

In some situations, however, we only need the block-level process to generate the random numbers, without also applying dropout to the vgpr tile in real time. For example, xformers' test_mem_eff_attention.py::test_dropout_ck requires the host reference implementation of attention forward with dropout to use the same random numbers as the device-side implementation in order to compare and verify it, so a standalone kernel that only generates random numbers is required.

This PR enables xformers' random-value generating kernel (in file ck_tiled_rand_uniform_kernel.h) to rely entirely on BlockDropout's Run() operator to generate random numbers for a [MPerBlock, NPerBlock] tile during tile iteration, with no need to replicate BlockDropout's logic in the xformers kernel.
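The dispatch idea can be illustrated with a minimal, self-contained C++ sketch. The types and names below (null_tile_window tag, cheap_rand, run_block_dropout) are hypothetical stand-ins for illustration only, not the actual ck_tile API or PRNG: when the P window is a null_tile_window, the same random numbers are still produced, but the numeric drop-out on the tile is skipped at compile time via if constexpr.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical tag type standing in for ck_tile's null_tile_window:
// it signals "there is no P tile to modify".
struct null_tile_window
{
};

// Toy counter-based hash RNG, a stand-in for the real tile PRNG.
// Deterministic in (seed, offset), so host and device sides can agree.
inline uint32_t cheap_rand(uint64_t seed, uint64_t offset)
{
    uint64_t x = seed ^ (offset * 0x9E3779B97F4A7C15ull);
    x ^= x >> 33;
    x *= 0xFF51AFD7ED558CCDull;
    x ^= x >> 33;
    return static_cast<uint32_t>(x);
}

// Sketch of the Run()-style dispatch: random numbers are always generated
// (and returned, so a standalone kernel could write them out); the
// drop-out itself is applied only when the window is a real tile.
template <typename PComputeWindow>
std::vector<uint32_t> run_block_dropout(PComputeWindow& p_window,
                                        uint64_t seed,
                                        uint64_t offset,
                                        int tile_elems,
                                        float rp_keep)
{
    std::vector<uint32_t> rands(tile_elems);
    for(int i = 0; i < tile_elems; ++i)
    {
        rands[i] = cheap_rand(seed, offset + i);
        if constexpr(!std::is_same_v<PComputeWindow, null_tile_window>)
        {
            // Real-tile path: drop or rescale the P element.
            p_window[i] = (rands[i] & 1u) ? p_window[i] * rp_keep : 0.0f;
        }
        // null_tile_window path: random numbers only, no drop-out applied.
    }
    return rands;
}
```

Because the random-number stream depends only on (seed, offset), calling this with a null_tile_window yields exactly the sequence the real drop-out path would consume, which is what a verification kernel needs.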

Comment thread projects/composablekernel/include/ck_tile/ops/fmha/block/block_dropout.hpp Outdated
@qianfengz qianfengz enabled auto-merge (squash) May 12, 2026 09:23
@qianfengz qianfengz merged commit 1fc20eb into develop May 13, 2026
43 of 48 checks passed
@qianfengz qianfengz deleted the users/qianfengz/ck/block_dropout_no_drop branch May 13, 2026 09:41
assistant-librarian Bot pushed a commit to ROCm/composable_kernel that referenced this pull request May 13, 2026
Skip numeric drop-out when PComputeWindow is a null_tile_window in Bl… (#7256)

